Contributor

Pavan-Microsoft commented Sep 29, 2025

Purpose

This pull request introduces significant improvements to the Azure Cognitive Search integration, focusing on enhanced chunking, embedding, and metadata handling for documents. The main changes include the addition of a custom WebApiSkill and supporting backend function to combine page texts with chunk numbers, updates to the skillset and indexer to leverage this new structure, and several workflow and documentation updates to support Azure AD authentication and region compatibility.

Azure Cognitive Search Pipeline Enhancements

  • Added a new Azure Function combine_pages_and_chunknos (code/backend/batch/combine_pages_chunknos.py) and registered it in the backend app to support a custom WebApiSkill for combining pages and chunk numbers in the skillset.
  • Updated the skillset pipeline to:
    • Output both pages and chunk_nos from the split skill.
    • Add a custom WebApiSkill that combines these arrays into a new pages_with_chunks structure.
    • Change the embedding skill to operate on pages_with_chunks/*/page_text instead of just pages.
    • Add a ShaperSkill to structure metadata for each chunk.
    • Update index projections to use the new chunked structure and include metadata objects.
  • Modified the indexer to use a base64 encoding mapping function for the document ID.
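
As an illustration of the combining step described above, here is a minimal sketch, not the code in combine_pages_chunknos.py. It assumes the function receives the standard custom Web API skill payload (a `values` array of records with `recordId` and `data`) and that each record carries `pages` and `chunk_nos` arrays of equal length. The `chunk_no` key and both function names are hypothetical; only `pages`, `chunk_nos`, `pages_with_chunks`, and `page_text` come from the description.

```python
from typing import Any


def combine_pages_and_chunk_nos(
    pages: list[str], chunk_nos: list[int]
) -> list[dict[str, Any]]:
    """Pair each page text with the chunk number at the same position."""
    if len(pages) != len(chunk_nos):
        raise ValueError("pages and chunk_nos must have the same length")
    return [
        # "chunk_no" is an assumed key name; "page_text" is from the PR text.
        {"page_text": text, "chunk_no": number}
        for text, number in zip(pages, chunk_nos)
    ]


def handle_skill_request(body: dict[str, Any]) -> dict[str, Any]:
    """Build a custom-skill style response with one result per input record."""
    results = []
    for record in body.get("values", []):
        data = record.get("data", {})
        result: dict[str, Any] = {
            "recordId": record.get("recordId"),
            "data": {},
            "errors": [],
            "warnings": [],
        }
        try:
            result["data"]["pages_with_chunks"] = combine_pages_and_chunk_nos(
                data.get("pages", []), data.get("chunk_nos", [])
            )
        except ValueError as exc:
            # Report the problem on this record only; other records still succeed.
            result["errors"].append({"message": str(exc)})
        results.append(result)
    return {"values": results}
```

Keeping the pairing logic in a plain function, separate from any HTTP handling, makes it possible to test without a running Azure Functions host. Under these assumptions, a downstream embedding skill could then read each element's `page_text`, as the bullet on `pages_with_chunks/*/page_text` suggests.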

Azure AD Authentication and Workflow Improvements

  • Refactored CI workflow to support Azure AD authentication for PostgreSQL:
    • Removed hardcoded credentials and switched to using Azure AD tokens via azure-identity.
    • Updated environment variables and outputs to include principal details and resource group info.
    • Improved resource group creation logic and notification steps.
  • Updated Makefile to remove hardcoded PostgreSQL credentials and ensure correct environment selection for destroy operations.
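
The token-based PostgreSQL authentication mentioned above could look roughly like the following sketch. It is not taken from the workflow in this PR. It assumes the `azure-identity` package is installed when `default_credential()` is called, and that the resulting keyword arguments are passed to a PostgreSQL driver such as psycopg2. The helper names, the `principal_name` parameter, and the returned key names are hypothetical. The scope string is the one Microsoft documents for Azure Database for PostgreSQL.

```python
from typing import Any, Protocol

# Token scope documented for Azure Database for PostgreSQL.
POSTGRES_AAD_SCOPE = "https://ossrdbms-aad.database.windows.net/.default"


class SupportsGetToken(Protocol):
    """Anything with a get_token(*scopes) method, e.g. DefaultAzureCredential."""

    def get_token(self, *scopes: str) -> Any: ...


def build_connection_kwargs(
    credential: SupportsGetToken, host: str, database: str, principal_name: str
) -> dict[str, str]:
    """Return driver keyword arguments that use an access token as the password."""
    access_token = credential.get_token(POSTGRES_AAD_SCOPE)
    return {
        "host": host,
        "dbname": database,
        "user": principal_name,  # the Azure AD principal, not a database password user
        "password": access_token.token,  # short-lived token instead of a stored secret
        "sslmode": "require",
    }


def default_credential() -> SupportsGetToken:
    """Create a DefaultAzureCredential; imported lazily to keep the helper testable."""
    from azure.identity import DefaultAzureCredential

    return DefaultAzureCredential()
```

A caller might write `psycopg2.connect(**build_connection_kwargs(default_credential(), host, database, principal_name))`. Because the token expires, long-running jobs would need to request a fresh one before reconnecting.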

Documentation and Configuration Updates

  • Added a list of supported Azure regions to the README.md and clarified deployment instructions regarding region and location selection.
  • Updated azure.yaml to require a minimum azd version for compatibility.
  • Changed the release workflow trigger to run after "Validate Deployment" instead of "CI".

These changes collectively improve document chunking, embedding accuracy, and metadata management in the search pipeline, while also strengthening security and deployment reliability.

Does this introduce a breaking change?

  • Yes
  • No

How to Test

  • Get the code
git clone [repo-address]
cd [repo-name]
git checkout [branch-name]
npm install

What to Check

Verify the team integration, deployment, and pipeline.

Roopan-Microsoft and others added 30 commits November 25, 2024 16:02
Co-authored-by: Roopan-Microsoft <[email protected]>
Co-authored-by: Ross Smith <[email protected]>
Co-authored-by: gpickett <[email protected]>
Co-authored-by: Francia Riesco <[email protected]>
Co-authored-by: Francia Riesco <[email protected]>
Co-authored-by: Prajwal D C <[email protected]>
…nd Update Conversation flow based on template selection (#1567)

Co-authored-by: Pavan Kumar <v-kupavan.microsoft.com>
Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Pavan-Microsoft <[email protected]>
Prajwal-Microsoft and others added 7 commits September 24, 2025 22:04
Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: Pavan-Microsoft <[email protected]>
Co-authored-by: Roopan-Microsoft <[email protected]>
Co-authored-by: Ajit Padhi <[email protected]>
Co-authored-by: Roopan P M <[email protected]>
Co-authored-by: Ross Smith <[email protected]>
Co-authored-by: gpickett <[email protected]>
Co-authored-by: Francia Riesco <[email protected]>
Co-authored-by: Francia Riesco <[email protected]>
Co-authored-by: Harmanpreet-Microsoft <[email protected]>
Co-authored-by: UtkarshMishra-Microsoft <[email protected]>
Co-authored-by: Priyanka-Microsoft <[email protected]>
Co-authored-by: Prasanjeet-Microsoft <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Kiran-Siluveru-Microsoft <[email protected]>
Co-authored-by: Prashant-Microsoft <[email protected]>
Co-authored-by: Rohini-Microsoft <[email protected]>
Co-authored-by: Avijit-Microsoft <[email protected]>
Co-authored-by: RaviKiran-Microsoft <[email protected]>
Co-authored-by: Somesh Joshi <[email protected]>
Co-authored-by: Himanshi Agrawal <[email protected]>
Co-authored-by: pradeepjha-microsoft <[email protected]>
Co-authored-by: Harmanpreet Kaur <[email protected]>
Co-authored-by: Bangarraju-Microsoft <[email protected]>
Co-authored-by: Harsh-Microsoft <[email protected]>
Co-authored-by: Kanchan-Microsoft <[email protected]>
Co-authored-by: Cristopher Coronado <[email protected]>
Co-authored-by: Cristopher Coronado Moreira <[email protected]>
Co-authored-by: Vamshi-Microsoft <[email protected]>
Co-authored-by: Thanusree-Microsoft <[email protected]>
Co-authored-by: Niraj Chaudhari (Persistent Systems Inc) <[email protected]>
Co-authored-by: Rohini-Microsoft <[email protected]>
Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: Roopan-Microsoft <[email protected]>
Co-authored-by: Ajit Padhi <[email protected]>
Co-authored-by: Roopan P M <[email protected]>
Co-authored-by: Ross Smith <[email protected]>
Co-authored-by: gpickett <[email protected]>
Co-authored-by: Francia Riesco <[email protected]>
Co-authored-by: Francia Riesco <[email protected]>
Co-authored-by: Prajwal D C <[email protected]>
Co-authored-by: Harmanpreet-Microsoft <[email protected]>
Co-authored-by: UtkarshMishra-Microsoft <[email protected]>
Co-authored-by: Priyanka-Microsoft <[email protected]>
Co-authored-by: Prasanjeet-Microsoft <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Kiran-Siluveru-Microsoft <[email protected]>
Co-authored-by: Prashant-Microsoft <[email protected]>
Co-authored-by: Rohini-Microsoft <[email protected]>
Co-authored-by: Avijit-Microsoft <[email protected]>
Co-authored-by: RaviKiran-Microsoft <[email protected]>
Co-authored-by: Somesh Joshi <[email protected]>
Co-authored-by: Himanshi Agrawal <[email protected]>
Co-authored-by: pradeepjha-microsoft <[email protected]>
Co-authored-by: Harmanpreet Kaur <[email protected]>
Co-authored-by: Bangarraju-Microsoft <[email protected]>
Co-authored-by: Harsh-Microsoft <[email protected]>
Co-authored-by: Kanchan-Microsoft <[email protected]>
Co-authored-by: Cristopher Coronado <[email protected]>
Co-authored-by: Cristopher Coronado Moreira <[email protected]>
Co-authored-by: Vamshi-Microsoft <[email protected]>
Co-authored-by: Thanusree-Microsoft <[email protected]>
Co-authored-by: Niraj Chaudhari (Persistent Systems Inc) <[email protected]>
Co-authored-by: Rohini-Microsoft <[email protected]>
…, and enhance local Teams dev setup (#1925)

…nt (#1926)

Pavan-Microsoft changed the title from "fix: Fix team integration citation issue and pipeline issue" to "fix: resolve team integration citation, BYOD flow, and pipeline issues" on Oct 3, 2025
Roopan-Microsoft added this pull request to the merge queue Oct 3, 2025
Merged via the queue into main with commit 0b6b7ad Oct 3, 2025
20 of 22 checks passed
github-actions bot commented Oct 6, 2025

🎉 This PR is included in version 1.16.0 🎉

The release is available on GitHub release

Your semantic-release bot 📦🚀
